Estimation of Kramers-Moyal coefficients at low sampling rates 
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A new optimization procedure for the estimation of Kramers-Moyal coefficients from stationary, 
one-dimensional, Markovian time series data is presented. The method takes advantage of a recently 
reported approach that allows to calculate exact finite sampling interval effects by solving the 
adjoint Fokker-Planck equation. Therefore it is well suited for the analysis of sparsely sampled time 
series. The optimization can be performed either making a parametric ansatz for drift and diffusion 
functions or also parameter free. We demonstrate the power of the method in several numerical 
examples with synthetic time series. 
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I. INTRODUCTION 

The behavior of complex systems consisting of a large 
number of degrees of freedom can often be described by 
low dimensional macroscopic order parameter equations 
[l[ • Thereby the influence of the microscopic degrees of 
freedom is treated via noise terms of Langevin type [H . 
In case of a single order parameter q(t) its time evolution 
can be described by 



q = h(q,t) + g(q,t)T(t) 



(1) 



where T(t) is a Gaussian distributed white noise term 
satisfying (T(t)) = and (T(t)T(t')) = 5(t-tf). Here and 
in the following Ito's interpretation of stochastic integrals 
is used 

The same information is contained in the correspond- 
ing Fokker-Planck equation (FPE) for the probability 
density function of f q (x, t) 



at 



L(x,t)f q (x,t) . 



(2) 



Here we have introduced the Fokker-Planck operator 

L(x,t) = -±. D V>(x,t) + ^D^(x,t) (3) 
which contains the Kramers-Moyal (KM) coefficients 



D {n) (x,t) 



limip + r)-«(f 
t->o n'.T 



\q(t)-- 



(4) 



also referred to as drift and diffusion for n — 1 and n = 2, 
respectively. The connection to the functions g and h in 
Eq. (flj is h(x,t) = jJland g(x,t) = ^/2D i ~ T >{x,t). 

As was recently shown [3|, |j] , it is possible to set up an 
equation of the form ([1]) by estimating the conditional 
averages in Q from a data set of the variable q(t). This 
method was applied in various fields of science, see Ref. 
5] for an overview. 
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There are two major problems connected to the esti- 
mation of drift and diffusion coefficients from measured 
"real world" time series. The first problem consists in 
the occurence of measurement noise. In Ref. [6] it was 
shown that measurement noise spoils the Markov prop- 
erty, the latter being a requirement for the KM analysis. 
A promising approach to handle Gaussian distributed ex- 
ponentially correlated measurement noise was recently 
proposed by Lehle 

The other problem in the Kramers-Moyal analysis is 
that one has to perform the limit r — > 0, while data 
sets are recorded at finite sampling intervals. Also in 
real world processes the intrinsic noise is not strictly 6- 
correlated, which results in a finite Markov-Einstein time, 
i. e., a finite time interval tme such that for time intervals 
t < tme the Markov property does no longer hold. It is 
observed that in case of a finite Markov-Einstein time, 
the KM coefficients go to zero with decreasing time in- 
terval r. 

Ragwitz and Kantz |8] were the first who presented a 
formular to estimate the KM coefficients that takes into 
account finite sampling interval effects at first order in the 
sampling interval. In a comment on this article Friedrich 
et al. 9J presented correction terms in form of an infinite 
series expansion in the sampling interval. Very recently 
Antenedo et al. presented exact analytical expressions for 
the finite time KM coefficients (s. Eq. ((SJ) for processes 
with linear drift and quadratic diffusion |1G | and later for 
other common processes (Tl| . 

A very elegant way to obtain finite time KM coeffi- 
cients for arbitrary (but sufficiently smooth) drift and 
diffusion terms was recently presented by Lade [12(. He 
reinterpreted the series expansion presented in [9( in a 
way that finite time coefficients can be obtained by solv- 
ing the adjoint Fokker-Planck equation. Since this can 
be done at least numerically, Lades method opens up 
the possibility to deduce the true KM coefficients from 
measured finite time coefficients by an optimization ap- 
proach. This is the topic of the present work. 

Of course, finite time KM coefficients can also be ob- 
tained by simulating Langevin equations and measuring 
the conditional moments at a finite r. This was done 
in the iterative method developed by Kleinhans et al. 
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[13l Il4| . But since this is numerically very demanding, 
the method was only applied to situations where very few 
parameters had to be optimized. In this article we show 
that an optimization based on Lades method can even be 
performed without a parametric ansatz for drift and dif- 
fusion coefficients, which should make it more applicable 
for a larger class of diffusion processes. 

In the next section we review the method of Lade [lJJ] 
that allows for a calculation of exact finite time effects. 
The following section gives a description of our new op- 
timization procedure. Section IIVI contains four numeri- 
cal examples in which the functionality of our method is 
demonstrated. 



II. EXACT FINITE SAMPLING INTERVAL 
EFFECTS 

From now on we assume the Langevin process of in- 
terest to be stationary, i.e., drift and diffusion do not 
explicitly depend on time. We define the finite time co- 
efficients as 

D^\x) = \m^)(x) (5) 
with the conditional moments 

/oo 
(x' -x) n p(x',t + r\x,t)dx' . (6) 
-oo 

The conditional probability density function p(x',t + 
t\x, t) is the solution of the corresponding FPE with the 
initial condition 5(x' — x), so it can be expressed as 



p(x', t + t\x, t) = e L< - x "> T S{x' - x) . 
Inserting this in ((6]) results in 

A4 n \x) = ((x' - x) n \e Ux ' )T \8{x' - x)) 
= (e L ^ x '>(x' -x) n \S(x' -x)) 

= e L ^ x ' )T ( x ' - x y 



(7) 



(8) 



X — X 

f oo 



where we use the notation (f\g) = J_ f(x')g(x')dx' for 

the inner product. Iy is the adjoint Fokker-Planck oper- 
ator 



(9) 



The main point of Lade's article 12j is to interpret eq. 
([8]) as the solution to the partial differential equation 

dW n , x (x',t) _ fu , )w , , v 

dt -Hx)W n ^x,t) (1Q) 

W n , x (x',0) = (x'-x) n 



at t = r, x = .To, i. e.: 

M^(x) = W n , x (x, T ) . 



For simple drift and diffusion coefficients, eq. (TlCfl) can 
be solved analytically. E.g., for an Ornstein-Uhlenbeck 
process with D^{x) = —72; and D^(x) = D, one 
obtains LUJ 



W hx (x',t) ^x'e-^ -x 
W 2 , x (x',t) = (x'e-^-xf 

With Eq. © and (HD we get 
=-- (l-e-^) 



D?\x) = l 



D 



D 



1-c 



-2 7 T\ 



(12) 
(13) 

(14) 
(15) 



A process with linear drift (x) = —^x and quadratic 
diffusion D^{x) = a + (3x 2 gives the same finite time 
drift as for the Ornstein-Uhlenbeck process. For the dif- 
fusion we obtain 



W 2 , x {x',t) = x 2 --^ 1-e 2 ^' 
/3 - 7 V 



(16) 



which leads to 

4 2) (*) = 



1 r 

27 



0-7 



1 



3 2(/3- 7 )i 



(17) 



If an analytical solution cannot be obtained one has to 
solve eq. (TTUl) numerically up to t = r for all x values of 
interest. 

An alternative way to calculate finite time effects 
would be to solve the real FPE, instead of the adjoint 
FPE, which yields the whole transition pdf. But this 
would involve a Dirac 5-function as an initial condition 
which is expected to cause numerical problems. The ad- 
joint FPE can be easily solved via a simple forward-time 
centered-space scheme. For the spatial derivatives on the 
left and right boundaries we use second order forward and 
backward finite differences, respectively. 



III. THE OPTIMIZATION PROCEDURE 

The first step of the optimization is to estimate 
the conditional moments ^ for a set of r values 
{ti,...,tm}, Ti < Ti+i, and a set of x values 
{x\, . . . , xjv}, Xi < Xj+i. The latter should be the same 
values that are later on used for the numerical integra- 
tion of the adjoint FPE. In a histogram based regression 
the size of the bins located at Xi is limited through the 
available amount of data. Therefore a kernel based re- 
gression as described in [15j is favorable which results in 

a smooth curve. We denote the estimated conditional 

~ (1 2) 

moments by Mf 4 ' (Xj). It is also important to calculate 
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statistical errors a. 
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FIG. 1: (Color online) Illustration of the optimization proce- 
dure. The red dots in both graphs show the estimated finite 
time coefficients D T 2 \x) = M T 2 \x) / (2t) for the example of 
Sec. II V CI The blue surface in the top panel corresponds to 
Dt (x, o"im). By minimizing V(a) we seek a set of parameters 



a such that Dt 1 ' 2 ^ (x, a) conforms to Dt 1 '*' (x), respectively, as 
is the case for the diffusion in the lower panel. 



(1,2), 



The optimization can be performed with or without 
the use of parameterized drift and diffusion functions. In 
the former case one has to embed the drift and diffu- 
sion functions into a family of functions D^'(x, a) and 
-D*- 2 -* (x, a), respectively, with a set of parameters denoted 
by a. 

In the latter case one has to define a set of sampling 
points {x\ , . . . , x s K }, K < N, and represent and 
as a spline interpolation through these sampling 
points. Then the set of parameters to be optimized 
is a = {DW(xD, . . . , £> (1) 04r), D i2) (x s i), D (2) (x s K )}. 



In both cases and DY^ can be used to construct an 
initial guess <7i n i. 

For a specific set of parameters a, the conditional mo- 
ments © can be calculated as described in sec. HH yield- 

(12) 

mg Since these computations are to be 

performed for each Xj individually, it is very easy and 
efficient to parallelize this part for the use on parallel 
computers. 

The final step is to find the minimum of the least 



l(2) 



r (l) 



{M^\x j )-M^\x j ,(j)y 



(» 



(18) 



Fig. [T] illustrates the idea of this procedure. 

For the optimization we use a trust region algorithm 
[l6j |. It turns out that for large sampling intervals ri, 
the best results are achieved, when only that single n 
is used, i.e. M = 1 in Eq. (|T8)) . For smaller sampling 
intervals the accuracy can be improved by the use of more 
t values. 

After the optimization procedure has converged to a 
certain set of parameters cr res , one can perform a self- 
consistency check by comparing graphically the functions 
D T 1,2 \x, a Tes ) and D T 1,2 \x) as in Fig. [1] 



IV. NUMERICAL EXAMPLES 

A. Ornstein-Uhlenbeck process 

As a first numerical example we consider an Ornstein- 
Uhlenbeck process with D^'ix) = —x and D^ 2 \x) = 1. 
A synthetic time series with 10 7 data points is com- 
puted using a forward Euler scheme with a time step 
At = 10~ 3 , but only every 1000th time step is stored. 
So the minimal time increment, that is available for the 
data analysis, is t\ = 1. The symbols with the error 
bars in Fig. [5] show the estimated finite time coefficients 
D T \^ (x) (top) and {x) (bottom) . From this it seems 
reasonable to make the parametric ansatz D^(x) = —ax 
and D^ 2 \x) = b + cx 2 . As an initial guess, we choose 
a ini = 0.63, bini = 0.43 and Q n i = 0.2. The correspond- 
ing curves are shown in blue in Fig. [2j The result- 
ing parameters from the optimization 0.9966, 
bics = 0.9995 and c rcs = 0.00032. These values corre- 
spond to the red curves in Fig. [21 For comparison we 
also plot the black dots which correspond to the true 
parameters a = 1, b = 1 and c = 0. 



B. Multiplicative noise 

The next example is a system with multiplicative 
noise, i.e., the diffusion term depends on x. We choose 
D^(x) — —x and D^ 2 \x) = 1 + x 2 . In the same manner 
as in the previous example, we construct a time series 
with 10 8 data points and a sampling interval t\ = 1. 
From the estimated finite time coefficients for D T \' 2 ^ we 
again deduce the parametric ansatz D^'(x) — —ax and 
D^ 2 \x) = b + cx 2 and take as an initial guess ai n ; = 0.63, 
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FIG. 2: (Color online) Results for an Ornstein-Uhlenbeck pro- 
cess with D^\x) — —x, D^ 2 \x) = 1. The analyzed time 
series consists of 10 r data points with a sampling interval 
Ti = 1. The blue crosses with error bars are the estimated fi- 
nite time coefficients (top) and (bottom). The blue 
dotted curves show the initial guesses for the optimization, 
the red solid ones show the result. For comparison also the 
true coefficients are plotted (black dots). 



FIG. 3: (Color online) System with multiplicative noise: 
D"' (x) = — x, D' 2 '(x) = I + x 2 . The analyzed time series 
consists of 10 s data points with a sampling interval n = 1. 
The representation is analog to Fig. [2] 



C. Bistable system 



&ini = 1-0 and Cj n j = 0.7. The finite time coefficients as 
well as the initial condition are depicted in blue in Fig. 
([3]). From the optimization we obtain the parameters 
a rcs = 0.9989, & res = 1.004 and c res = 0.9963. The corre- 
sponding curves are shown in red in Fig. Q as well as 
the true coefficients with a = b = c = 1 (black dots) . 



The first example for a parameter free optimization is 
a bistable system with D^> (x) = x — x 3 and D^ 2 \x) = 1 . 
The blue dots in Fig. ^ correspond to the finite time co- 
efficients D y Tl ' '. They are used as the initial guess for the 

parameters a to be optimized. The terms Mt ( ' (xj , c) 
in Eq. (|18p are now calculated by a spline interpolation 
between these sampling points. They are shown in blue 
in Fig. Q . The resulting parameters that minimize (fT8|) 
are the red squares from which the red spline curves are 
calculated. The latter represent the resulting drift and 
diffusion coefficients. 
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FIG. 4: (Color online) Bistable system with D m (x) = x-x 3 , 
D^ 2 \x) = 1. The analyzed time series consists of 10 7 data 
points with a sampling interval ri = 0.1. The blue and red 
symbols are sampling points, from which the corresponding 
curves are computed as spline interpolations, and serve as 
optimization parameters. The blue dots represent the initial 
condition derived from the finite time coefficients Dr x ' . The 
red squares show the result of the optimization. The black 
dashed curves show the true coefficients for comparison. 



D. Phase dynamics 

As a last example, we consider a phase variable <j> that 
can also be a phase difference 4> = 4>\ — 4>2 between 
two coupled nonlinear oscillators. The reconstruction of 
phase dynamics from data sets is an important theoret- 
ical problem that is relevant in many different fields of 
science. The problem was among others tackled by Krale- 
mann et al. [17| . We suggest the KM approach as a less 
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FIG. 5: (Color online) Phase dynamics with D"' (x) = 
0.2 + cos(x), D' 2 '(x) = 0.5. The analyzed time series con- 
sists of 10 7 data points with a sampling interval ri = 1. The 
representation is analog to Fig. U 



cumbersome alternative. 

In the case of phase dynamics, the drift and diffu- 
sion coefficients must be 27r-periodic, i.e., D^ n '{x) — 
D^(x + 27r). Therefore, it makes sense to define the 
KM coefficients as 

(19) 

Phase dynamics are often governed by Langevin equa- 
tions of the form 

</> = uj + cos(0) + V2DT . (20) 

We consider the case w = 0.2; D = 0.5, so we have 
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= 0.2 + cos(x) and D^ 2 \x) = 0.5 . Fig. [5] shows 
the result in the same representation as in Fig. 01 

V. SUMMARY AND OUTLOOK 

We have presented a novel optimization procedure for 
the estimation of drift and diffusion coefficients for one- 
dimensional Markovian time series that suffer from large 



sampling intervals. The optimization can be performed 
both in a parametric and non-parametric fashion. There- 
fore, it is applicable for a large class of diffusion processes. 
The usefulness of our method is demonstrated in four 
examples with synthetic time series. The method yields 
good results, even if the sampling interval is of the order 
of the typical time scales of the deterministic part of the 
dynamics. 
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